Cache Coherence Verification with TLA+
Abstract
We used the specification language TLA+ to analyze the correctness of two cache-coherence protocols for shared-memory multiprocessors based on two generations (EV6 and EV7) of the Alpha processor. A memory model defines the relationship between the values written by one processor and the values read by another, and a cache-coherence protocol manipulates the caches to preserve this relationship. The cache-coherence protocol is a fundamental component of any shared-memory multiprocessor design, so proving that the coherence protocol implements the memory model is a high-leverage application of formal methods. The analysis of the first protocol was largely a research project, but the analysis of the second protocol was part of the engineers' own verification process.

The EV6-based multiprocessor uses a highly optimized, very complicated cache-coherence protocol. The protocol uses about sixty different types of messages, and its documentation consists of a stack of twenty documents about four inches tall, none of it complete or precise enough to be the basis of a proof. After more than two man-years of effort, four of us were able to write a 1900-line specification of the algorithm, a 200-line specification of the Alpha memory model, and about 3000 lines of proof that the algorithm implements the memory model. This was far from a complete proof, but it was enough of a proof to subject the algorithm to a rigorous analysis and to discover one bug in the protocol and one bug in the memory model.

The cache-coherence protocol for EV7-based multiprocessors is dramatically simpler, bringing a complete correctness proof within the realm of possibility. A new tool, a model checker for TLA+ called TLC, increased the odds of success. TLC enumerates the reachable states in a finite-state model of a specification written in an expressive subset of TLA+, and it checks that an invariant written in TLA+ holds in each of these states. When TLC discovers an error, it reports a minimal-length sequence of states leading from an initial state to a bad state. One of us wrote an 1800-line specification of the algorithm. Using TLC to check multiple invariants uncovered about 66 errors of various kinds. The engineers were also able to use state sequences output by TLC as input to their own RTL-verification tools, an interesting case of formal methods helping engineers use their own tools more efficiently.

We were pleased to see that the basic verification methodology, refined through years of research, works pretty much as expected, although the proofs were hard. The engineers had little difficulty learning to read and write TLA+ specifications. We hope TLA+ will play a role in other projects in the near future.
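The kind of check the abstract attributes to TLC (enumerate reachable states, test an invariant in each one, and report a minimal-length sequence of states to a bad state) can be illustrated with a breadth-first search. The Python sketch below is not TLC and does not use TLA+. The function names (`check_invariant`, `make_next`, `coherent`) and the toy two-cache protocol with Invalid/Shared/Modified states are assumptions made purely for illustration, and they bear no relation to the EV6 or EV7 protocols.

```python
from collections import deque


def check_invariant(initial_states, next_states, invariant):
    """Explore reachable states breadth-first.

    Returns None if the invariant holds in every reachable state,
    otherwise a shortest list of states from an initial state to a
    state that violates the invariant.
    """
    parent = {}          # state -> predecessor (None for initial states)
    queue = deque()
    for s in initial_states:
        if s not in parent:
            parent[s] = None
            queue.append(s)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:   # walk predecessors back to the start
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for succ in next_states(state):
            if succ not in parent:     # visit each state only once
                parent[succ] = state
                queue.append(succ)
    return None


def make_next(invalidate_on_write):
    """Hypothetical toy protocol: each cache is 'I', 'S', or 'M'."""
    def next_states(state):
        n = len(state)
        for i in range(n):
            # Read by cache i: it gets a shared copy; a modified copy is downgraded.
            yield tuple("S" if (j == i or c == "M") else c
                        for j, c in enumerate(state))
            # Write by cache i: the correct variant invalidates all other copies.
            if invalidate_on_write:
                yield tuple("M" if j == i else "I" for j in range(n))
            else:  # deliberately buggy variant: other copies are left untouched
                yield tuple("M" if j == i else c for j, c in enumerate(state))
            # Eviction by cache i.
            yield tuple("I" if j == i else c for j, c in enumerate(state))
    return next_states


def coherent(state):
    """Invariant: a modified copy excludes every other valid copy."""
    writers = state.count("M")
    sharers = state.count("S")
    return writers == 0 or (writers == 1 and sharers == 0)


start = [("I", "I")]
ok_trace = check_invariant(start, make_next(True), coherent)
bug_trace = check_invariant(start, make_next(False), coherent)
print("correct variant:", ok_trace)
print("buggy variant:", bug_trace)
```

Because the search is breadth-first, the first violating state it dequeues is at minimal distance from an initial state, which is why the returned trace is of minimal length. A real model checker must also cope with far larger state spaces than this sketch can handle.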
Similar resources
Checking Cache-Coherence Protocols with TLA+
We have a great deal of experience using the specification language TLA+ and its model checker TLC to analyze protocols designed at Digital and Compaq (both now part of HP). The tools and techniques we have developed apply equally well to software and hardware designs. In this paper, we describe our experience using TLA+ and TLC to verify cache-coherence protocols.
Model Checking TLA+ Specifications
TLA+ is a specification language for concurrent and reactive systems that combines the temporal logic TLA with full first-order logic and ZF set theory. TLC is a new model checker for debugging a TLA+ specification by checking invariance properties of a finite-state model of the specification. It accepts a subclass of TLA+ specifications that should include most descriptions of real system designs...
Proofs of Correctness of Cache-Coherence Protocols
We describe two proofs of correctness for Cachet, an adaptive cache-coherence protocol. Each proof demonstrates soundness (conformance to an abstract cache memory model CRF) and liveness. One proof is manual, based on a term-rewriting system definition; the other is machine-assisted, based on a TLA formulation and using PVS. A two-stage presentation of the protocol simplifies the treatment of soun...
Formal Verification of a Novel Snooping Cache Coherence Protocol for CMP
The Chip Multiprocessor (CMP) architecture offers dramatically faster retrieval of shared data which is cached on-chip rather than in an off-chip memory. Remote cache requests are handled through a cache coherence protocol. In order to obtain the best possible performance with the CMP architecture, the cache coherence protocol must be optimized to reduce time lost during remote cache and offchi...